On the Topic Discovery Using Query Logs and Hyperlink

Authors

  • Chih-Ming Tseng
  • Yun-Fei Wei
  • Chiun-Chieh Hsu
Abstract

With the rapid growth of the World Wide Web, the amount of information on the Web has expanded on an unpredictable scale. Most researchers have recently tried to classify search results with conventional information retrieval techniques. Unlike traditional documents, however, Web pages have distinct characteristics of their own, so some researchers have begun to exploit the hyperlinks between Web pages. In this paper, we propose a topic discovery algorithm that combines query log analysis with hyperlink analysis. We use the query log to find the Web pages that are representative with respect to users’ endorsements, and apply a link-based clustering algorithm to group similar topics. Each Web page is ranked according to the endorsements of search engine users and of Web page creators. The experimental results show that our method performs better than the pure hyperlink analysis algorithm (ATD) in terms of topic discrimination and topic quality.
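The abstract does not give the ranking formula, so the following is only a minimal sketch of the general idea of mixing the two kinds of endorsement. It assumes that user endorsement can be approximated by click counts in a query log and creator endorsement by in-link counts, and that the two are combined linearly. The function names, the input formats, and the weight `alpha` are all hypothetical, and the link-based clustering step is not shown.

```python
from collections import Counter


def normalize(counts):
    """Scale counts so the largest value is 1.0 (empty input stays empty)."""
    top = max(counts.values(), default=0)
    return {key: value / top for key, value in counts.items()} if top else {}


def rank_pages(click_log, links, alpha=0.5):
    """Rank pages by a weighted mix of click and in-link endorsements.

    click_log: iterable of (query, clicked_url) pairs (assumed log format)
    links:     iterable of (source_url, target_url) pairs
    alpha:     weight on user endorsements (assumed parameter, not from the paper)
    """
    # User endorsement: how often each page was clicked in the query log.
    user = normalize(Counter(url for _, url in click_log))
    # Creator endorsement: number of in-links, ignoring self-links.
    creator = normalize(Counter(dst for src, dst in links if src != dst))
    pages = set(user) | set(creator)
    scores = {
        page: alpha * user.get(page, 0.0) + (1 - alpha) * creator.get(page, 0.0)
        for page in pages
    }
    # Highest score first; ties are broken by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With `alpha=1.0` the ranking depends only on the query log, and with `alpha=0.0` it reduces to plain in-link counting, which makes it easy to compare the two signals on the same data.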

Similar resources

Measure of systemic risk in the interbank market in Iran by buffer capital and hyperlink-induced topic search algorithm

Because the interbank market serves as an overnight market that provides short-term liquidity to banks, one of the most important risks in this market, owing to the short-term nature of its transactions, is systemic risk. When this risk materializes, it can have devastating effects for monetary policymakers, as in the 2007-2009 crisis. In this study, first, the buffer capital ...

Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)

Background and Aim: Information systems cannot be well designed or developed without a clear understanding of users' needs and of how they seek and evaluate information. This research was designed to analyze the query refinement behaviors of users of Ganj (the Iranian research institute of science and technology database) via log analysis. Methods: The method of this research is log anal...

Mining Associative Relations from Website Logs and their Application to Context-Dependent Retrieval Using Spreading Activation

We have devised a methodology that mines sequential navigation patterns from a website's logs to enable us to identify the most significant associative links in the networks. Spreading activation can then be applied to the generated network of weighted hyperlinks, enabling the content-dependent, semantic retrieval of nodes in the network. This approach to information retrieval avoids many of the ...

Discovering and understanding word level user intent in Web search queries

Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue t...

Query Topic Classification and Sociology of Web Query Logs

In the paper, the objects, tasks, and a general procedure of the sociological analysis of Web search engine query logs are described and illustrated by a methodologically complete study of the cross-nation search image changes based on two-year spaced query logs of the national search audience.

Publication date: 2006